Measuring Syntagmatic Fixedness of Multi-Word Expressions

Authors

  • Axel Herold
  • Katerina Stathi
Abstract

Syntagmatic fixedness is an important feature of multi-word expressions (MWEs). However, syntagmatic fixedness is gradual, and various semantic and syntactic relations hold among the parts of MWEs. This poses intriguing problems for lexicography, linguistic description and language processing. In this paper we propose a computationally inexpensive and intuitive approach to the measurement of syntagmatic fixedness based on positional co-occurrence data that is not easily captured by simple statistical significance tests.

The types of relations between frequently and systematically co-occurring lexical items have been the subject of studies in phraseology, corpus lexicography and corpus linguistics. MWEs have been classified according to different criteria: the compositionality of their meaning, their syntactic structure (phrasal vs. sentential), their internal structure, their grammatical well-formedness, their communicative function, and their metaphoricity. For classifications of MWEs, cf. Moon (1998), Cowie (1998) and Burger (1998). The types of MWEs most frequently cited include named entities, idioms, proverbs, similes, routine formulae and sayings. To these we might add conventional metaphors, which often involve the co-occurrence of lexical items.

In the following we will focus on three important concepts in this discussion, namely co-occurrence, collocation, and idiom. Our emphasis is on co-occurrence between tokens of the underlying corpus, but other notions are possible. In general, co-occurrence is seen as the existence of words in structural or positional adjacency or proximity (Evert, 2005). The concept of co-occurrence is discussed in more detail in section 3. Whereas the term co-occurrence is used in a rather uniform way, the term collocation has been used in a variety of ways, which can be summarised as follows:
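As an illustration of the positional co-occurrence data that the abstract names as the basis of the approach, the following is a minimal sketch and not the authors' measure, which this excerpt does not specify. It assumes a tokenised corpus, counts how often a collocate appears at each relative offset from a node word, and scores fixedness as one minus the normalised entropy of that offset distribution. The function names `positional_counts` and `positional_fixedness`, the window size, and the entropy-based score are all hypothetical choices made for this example.

```python
import math
from collections import Counter


def positional_counts(tokens, node, collocate, window=3):
    """Count occurrences of `collocate` at each relative offset from `node`.

    Hypothetical helper: offsets range over -window..+window, excluding 0.
    """
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok != node:
            continue
        for offset in range(-window, window + 1):
            j = i + offset
            if offset != 0 and 0 <= j < len(tokens) and tokens[j] == collocate:
                counts[offset] += 1
    return counts


def positional_fixedness(counts, window=3):
    """Illustrative score (an assumption, not taken from the paper).

    Returns 1 - H / log2(2 * window), where H is the entropy of the offset
    distribution. A pair that always occurs at the same offset scores 1.0;
    a pair spread evenly over all offsets scores 0.0.
    """
    total = sum(counts.values())
    if total == 0:
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())
    return 1.0 - entropy / math.log2(2 * window)


# Usage: a pair with a rigid order versus one that occurs in both orders.
fixed = "they kick the bucket and we kick the bucket".split()
free = "salt and pepper or pepper and salt".split()
fixed_score = positional_fixedness(positional_counts(fixed, "kick", "bucket", 2), 2)
free_score = positional_fixedness(positional_counts(free, "salt", "pepper", 2), 2)
```

Under these assumptions a pair whose members keep a constant distance scores higher than a pair that co-occurs equally often but in varying positions, which is the kind of distinction a plain co-occurrence frequency count does not make.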


Similar resources

Corpus-Driven Study of Multi-Word Expressions Based on Collocations from a Very Large Corpus

We present a corpus-driven approach to the study of multi-word expressions, which constitute a significant part of. As a data basis, we use collocation profiles computed from DeReKo (Deutsches Referenzkorpus), the largest available collection of written German which has approximately two billion word tokens and is located at the Institute for the German Language (IDS). We employ a strongly usag...


Measuring Similarity from Word Pair Matrices with Syntagmatic and Paradigmatic Associations

Two types of semantic similarity are usually distinguished: attributional and relational similarities. These similarities measure the degree of similarity between words or word pairs. Attributional similarities are bidirectional, while relational similarities are one-directional. It is possible to compute such similarities based on the occurrences of words in actual sentences. Inside sentences, syntagmatic as...


Multifunction Thesaurus For Russian Word Processing

A new type of thesaurus for word processing is proposed. It comprises 7 semantic and 8 syntagmatic types of links between Russian words and collocations. The original version now includes ca. 76,000 basic dictionary entries, 660,000 semantic and 292,000 syntagmatic links, English interface, and communication with any text editor. Methods of delivery enriching are used based on generic and synon...


The Semantics of the Word Istikbar (Arrogance) in the Holy Quran based on Syntagmatic Relations (A Case Study of Semantic Proximity and Semantic Contrast)

The word istikbar (arrogance) is one of the key words in the monotheistic system of the Quran, which has found a special status as a special feature of the opponents and adversaries of the call to the truth. Given the prominent role of this issue in the human life system and its provision of corruption and moral deviations, it is necessary to represent the nature of the elements that make up th...


Syntagmatic Kernels: a Word Sense Disambiguation Case Study

In this paper we present a family of kernel functions, named Syntagmatic Kernels, which can be used to model syntagmatic relations. Syntagmatic relations hold among words that are typically collocated in a sequential order, and thus they can be acquired by analyzing word sequences. In particular, Syntagmatic Kernels are defined by applying a Word Sequence Kernel to the local contexts of the wor...




Publication date: 2007